ABSTRACT

Background: The term continuous quality improvement 
(CQI) is often used to refer to a method for improving 
care, but no consensus statement exists on the 
definition of CQI. Evidence reviews are critical for 
advancing science, and depend on reliable definitions 
for article selection. 

Methods: As a preliminary step towards improving CQI 
evidence reviews, this study aimed to use expert panel 
methods to identify key CQI definitional features and 
develop and test a screening instrument for reliably 
identifying articles with the key features. We used
a previously published method to identify 106 articles
meeting the general definition of a quality improvement
intervention (QII) from 9427 electronically identified
articles from PubMed. Two raters then applied
a six-item CQI screen to the 106 articles.

Results: Per cent agreement ranged from 55.7% to
75.5% for the six items, and reviewer-adjusted
intra-class correlation ranged from 0.43 to 0.62.
'Feedback of systematically collected data' was the
most common feature (64%), followed by being at
least 'somewhat' adapted to local conditions (61%),
feedback at meetings involving participant leaders
(46%), using an iterative development process (40%),
being at least 'somewhat' data driven (34%), and using
a recognised change method (28%). All six features
were present in 14.2% of QII articles.

Conclusions: We conclude that CQI features can be
extracted from QII articles with reasonable reliability,
but only a small proportion of QII articles include all
features. Further consensus development is needed to
support meaningful use of the term CQI for scientific
communication.



INTRODUCTION 

Continuous quality improvement (CQI) 
represents a set of methods for improving 
healthcare 1-4 that originated from industrial
process improvement approaches. 5 6 One
evidence review describes CQI as 'a philos- 
ophy of continual improvement of the 
processes associated with providing a good or 
service that meets or exceeds customer 
expectations'. Although a useful starting 
point, this definition has not emerged from 
formal consensus processes, has not been 
tested for reliability, and may therefore be 
difficult to operationalise in evidence 
syntheses. Greater consensus on key features 
of CQI that could be reliably operationalised 
would improve the reporting, cataloguing, 
and systematic review of CQI interventions. 

We acknowledge that meanings fluctuate 
over time. 8 9 The term CQI has a complex 
heritage from use in both industry and 
healthcare, and seeking to create a normative 
definition may perturb this evolution. 9 Science, 
however, depends upon clear word usage for 
communication, and efforts to understand 
scientific meaning have often promoted 
scientific development in both clinical 10-12 
and methodological 13-19 domains. In the work 
presented here, we aimed to understand the 
current usage of the term CQI as a step 
towards improving scientific communication in 
the quality improvement field. 

This work is part of the 'Advancing the 
Science of Continuous Quality Improvement' 
(ASCQI) Program funded by the Robert 
Wood Johnson Foundation, a US-based 
healthcare-oriented philanthropic organisa- 
tion. One ASCQI aim was to 'develop 
methods, tools and standards for the design, 
conduct and reporting of CQI research and 
evaluations, including standardised typolo- 
gies, definitions and measures of key 
concepts and consensus statements'. 20 
Towards that aim, this study developed a screen for
CQI features, tested it for reliability, and applied it to 
electronically identified quality improvement interven- 
tion (QII) articles to assess which key CQI features are 
most commonly present in today's quality improvement 
literature. 

METHODS 

Overview 

We first elicited a broad range of existing definitions for 
CQI, distilled them into candidate key features, and 
engaged an expert panel to rate and refine the features. 
We then used a previously published QII definition 21 as 
the basis for a QII screening form and applied it to 
articles from a broad electronic search. Finally, we 
operationalised the highest-scoring consensus-based 
CQI features as an assessment form and applied it to the 
QII article set. 

Identification of potential key features of CQI 

To identify key features of CQI, we conducted a simplified,
sequential group consensus process, similar to a repeated
focus group with feedback. We organised a 12-member expert
panel, intentionally encompassing a diverse range of
methodological perspectives representing both quality
improvement and research expertise. Individual experts
included process consultants, researchers and institutional
decisionmakers from both the USA and the UK; several
additionally serve as editors of clinical or quality
improvement journals (see 'Acknowledgements' for complete
list). To begin generating potentially definitional features
of CQI, Robert Wood Johnson Foundation staff reviewed grant
applications to the ASCQI Program and abstracted 48 phrases
used by applicants to define 'CQI'. Two authors (LR, SH)
independently reviewed these phrases to ascertain common
themes, reconceptualised them as a list of unique,
potentially definitional features, and then met to discuss
and reach agreement on the list.

The expert panel then completed an online survey of 
the features, reviewed survey results, and discussed the 
results on two conference calls. The survey asked, for 
each feature: 'Is this feature necessary (definitional) for
CQI?' (5=definitely; 4=probably; 3=no difference;
2=probably not; 1=definitely not). The survey and
discussion process enabled the addition of features to
the original set from other sources, as suggested by the
panel or research team. 20-29 Table 1 lists 12 features
(A-L) finalised when the process had ceased generating
additional potential features. Panelists rated the 12
features again at a final in-person meeting, resulting in 
six features (A, C, D, E, G, K) rated as 'definitely' or 
'probably' necessary (definitional) for CQI (median 
value ≥4.0). The final column in table 1 shows which
items on the final CQI features assessment form
reflected each 'definitely' or 'probably' definitional
feature.

Table 1 Potentially definitional continuous quality improvement (CQI) features

Feature  Description                                                'Definitely' or 'probably'  Item(s) on CQI features
                                                                    definitional for CQI        assessment form
A        The intervention involves an iterative development and     X                           CQI-1, CQI-5
         testing process such as PDSA (Plan-Do-Study-Act)
B        The intervention is designed and/or carried out by teams
C        The intervention uses systematic data-guided activities    X                           CQI-3, CQI-5
         to achieve improvement
D        The intervention involves feedback of data to              X                           CQI-2, CQI-5
         intervention designers and/or implementers
E        The intervention aims to change how care is organised,     X                           QII-4
         structured, or designed
F        The intervention aims to change the daily work or
         routine within an organisation
G        The intervention identifies one or more specific methods   X                           CQI-4
         (eg, change strategies) aimed at producing improvement
H        The intervention aims to redesign work processes
I        The intervention uses available previously established
         evidence relevant to the target QI problem or goal
J        The intervention seeks to create a culture or mindset of
         quality improvement
K        The intervention is designed/implemented with local        X                           CQI-6
         conditions in mind
L        The intervention is shaped by clearly defined desired
         outcomes/targets

Criteria for identifying QII studies

We focused on QII studies that, as described previously, 21 
addressed effectiveness, impacts, or success; qualitative, 
quantitative and mixed-methods studies were all consid- 
ered eligible. Our QII screening form identified articles 
that: (QII-1) reported on an intervention implemented 
in or by a healthcare delivery organisation or organisa- 
tional unit; (QII-2) reported qualitative or quantitative 
data on intervention effectiveness, impacts, or success; 
(QII-3) reported on patient (or care giver) health 
outcomes; and (QII-4) aimed to change how delivery of 
care was routinely structured within a specific organisa- 
tion or organisational unit. All four QII criteria had to be 
present according to two independent reviewers for an 
article to be included. We thus excluded studies that only 
reported cost or provider knowledge/attitude 
measures. 21 The fourth criterion, QII-4, conceptually 
overlaps with potential CQI feature E ('The intervention 
aims to change how care is organised, structured, or 
designed') as identified by our CQI expert panel. 
Because we selected articles for our QII sample based on 
this criterion, 100% of studied articles had this feature. 
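
As an illustration only, the inclusion rule described above can be sketched in Python. The function and variable names are hypothetical and are not part of the study's screening form, and the sketch leaves out the adjudication step used when the two reviewers disagreed.

```python
# Hypothetical sketch of the QII inclusion rule: an article is included
# only if BOTH reviewers judge all four criteria (QII-1..QII-4) present.
# Names and data layout are assumptions made for illustration.

QII_CRITERIA = ("QII-1", "QII-2", "QII-3", "QII-4")


def meets_all_criteria(ratings):
    """True if one reviewer marked every QII criterion as present."""
    return all(ratings.get(criterion, False) for criterion in QII_CRITERIA)


def include_article(reviewer_a, reviewer_b):
    """True only when both reviewers independently find all criteria present."""
    return meets_all_criteria(reviewer_a) and meets_all_criteria(reviewer_b)


# Example: reviewer B judges that no patient health outcome (QII-3) is reported.
reviewer_a = {"QII-1": True, "QII-2": True, "QII-3": True, "QII-4": True}
reviewer_b = {"QII-1": True, "QII-2": True, "QII-3": False, "QII-4": True}
print(include_article(reviewer_a, reviewer_a))  # both agree on all four
print(include_article(reviewer_a, reviewer_b))  # one criterion missing
```

Because QII-4 is part of this rule, every included article has feature E by construction, which is why that feature is not assessed again on the CQI form.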

Criteria for assessing CQI features 

With feature E already part of the QII screening form, we 
then incorporated the five remaining 'definitely' or 
'probably' definitional CQI features into a six-item 
assessment form (table 2), and refined the form and its 
guidelines through pilot testing. These five features were
CQI-1 ('The intervention involves an iterative development
and testing process such as PDSA (Plan-Do-Study-Act)'),
CQI-2 ('The intervention involves feedback of data
to intervention designers and/or implementers'), CQI-3
('The intervention uses systematic data-guided activities
to achieve improvement'), CQI-4 ('The intervention
identifies one or more specific methods (eg, change
strategies) aimed at producing improvement') and CQI-6
('The intervention is designed/implemented with local
conditions in mind'). The concept of being 'data driven'
was a consistent theme of the expert panel discussions,
manifested through items CQI-1, CQI-2 and CQI-3, which
all reflect data use but do not use the term 'data driven'.
We therefore added a sixth item, CQI-5, as a potentially
more direct assessment of this concept.

We used a three-point scale with explicit criteria for all 
items during pilot testing. However, CQI-5 (data driven) 
and CQI-6 (designed for local conditions) were not reli- 
able in this form. We therefore used a five-point implicit 
(reviewer judgement-oriented) review scale for these two 
items. Pilot testing showed better reliability for the three- 
point and five-point scales than simple yes/no responses. 



Exploratory items 

To further enhance our understanding of the QII and 
CQI literature, we collected additional information on 
reviewed articles. We assessed setting, evaluation target 
(ie, change package, change method, or both), evalua- 
tion design (ie, presence/absence of a comparison 
group), researcher involvement in authorship, results 
(ie, whether the intervention demonstrated positive 
effects), and journal type (to explore potential differ- 
ences in reporting across publication venues). 'Change 
package' describes the set of specific changes for 
improving care (reminders, tools, or other care model or 
prevention elements) implemented in a QII, while 
'change method' describes the approach used to introduce
and implement the change package (eg, CQI, Lean,
Six-Sigma, Reengineering). For assessing journal
type, we characterised journals as clinical (general,
nursing, or specialty) or quality improvement/health
services research.

QII sample identification and screening 

To reflect usual methods for evidence review, we began 
with electronically searched articles. We developed search 
strategies for the MEDLINE (Ovid) and PubMed data- 
bases based on free text words, medical subject headings, 
QI intervention components, CQI methods, and combi- 
nations of the strategies (Hempel et al, submitted). 
Searches included a broad range of terms ('quality' 
AND 'improv*' AND 'intervention*') indicating quality 
improvement in general, as well as the following 
CQI-related terms: 'Plan-Do-Study-Act', 'Plan-Do-Check- 
Act', 'Define-Measure-Analyse-Improve-Control', 'Define- 
Measure-Analyse-Design-Verify', 'iterative cycle', Deming, 
Taguchi, Kansei, Kaizen, 'six-sigma', 'total quality 
management', 'quality function deployment', 'House 
of quality', 'quality circle', 'quality circles', 'Toyota 
production system', 'lean manufacturing' and 'business 
process reengineering'. The search resulted in 9427 
articles. 

To identify candidate QII articles from this set, two 
authors (LR, PS) used previously described definitions 22 
to identify 201 potentially relevant titles and abstracts 
reporting empirical data on a QII from among 1600 
randomly selected articles. We then screened the 
remainder of the 9427 articles using an experimental 
machine learning algorithm that utilised the manual 
title/abstract review as a learning set. We added 49 
machine-screened articles that screened in at a maximal
confidence level. Finally, we added 24 articles
recommended by expert panel members as QII exemplars,
resulting in a total of 272 candidates.

We identified QII articles from among these 272 using
the QII screening form with the explicit criteria
discussed above (QII-1 through QII-4). 21 Two reviewers
(MD plus SH or SO) reviewed the full text of each
candidate article to apply the screen, and consulted LR
for resolution when there was disagreement.

Table 2 Continuous quality improvement (CQI) features assessment for articles identified as studies of quality improvement
interventions (QIIs)

CQI-1 ITERATIVE DEVELOPMENT PROCESS: Did the improvement initiative involve iterative
design AND implementation of a set of specific changes for improving care (ie, a change
package)?

Iterative Development: Cyclical process, such as Plan-Do-Study-Act cycles, in which the initial design and implementation of the set
of changes for improving care is followed by redesign and reimplementation. A single iterative cycle thus includes initial
implementation followed by assessment, redesign, and reimplementation.

1) 0 cycles
2) 1 or unclear # (>0) of complete cycles
3) 2 or more complete cycles

CQI-2 FEEDBACK AT MEETINGS INVOLVING PARTICIPANT LEADERS: Did leaders of the
improvement initiative (eg, local managers, clinical leaders, central experts, or improvement
teams) from participating study organisation(s) or local study site(s) meet to review
information on initiative implementation?

Information on Implementation Progress: Includes formal feedback, review of interim outcomes, and/or informal
discussions relating to progress on the introduction of a set of changes for improving care (ie, the change
package) into organisation(s) or site(s).

Feedback Meetings Involving Participant Leaders: Can be by telephone or in person, but not by paper or e-mail
only (ie, must provide opportunity for interaction). Must include organisation or site leaders and not researchers
alone.

1) No/Don't Know
2) Participant leader meetings, but unclear if improvement initiative implementation discussed
3) Yes, participant leader meetings where improvement initiative implementation discussed

CQI-3 FEEDBACK OF SYSTEMATICALLY-COLLECTED DATA: Did the improvement
initiative include feedback of systematically-collected data on implementation?

Systematically-collected data: Quantitative or qualitative data, collected according to a design or plan or for
which methods are specified in the article. Exclude information produced at a meeting at which random
individuals discuss problems. Include only data collected during implementation of a set of changes for improving
care (ie, change package).

1) No/Don't Know
2) Feedback of systematically-collected data on implementation: only one data point
3) Yes, feedback of systematically-collected data on implementation: multiple data points

CQI-4 RECOGNIZED CHANGE METHOD: Were one or more recognised change methods used in
the improvement initiative?

System change methods such as the following: 'CQI, Continuous Quality Improvement'; DMAIC,
Define-Measure-Analyse-Improve-Control; DMADV, Define-Measure-Analyse-Design-Verify; approaches of
Deming, Taguchi, Kaizen, Juran, or Kansei; 'six-sigma'; 'total quality management'; 'quality function
deployment'; 'House of quality'; 'quality circle'; 'Toyota production system'; 'lean manufacturing'; 'business
process reengineering'; CRM, 'crew resource management'; 'Breakthrough Series'; 'Institute for Healthcare
Improvement' quality improvement; Evidence-based Quality Improvement.
Specify other terms used:

1) No/Don't Know
2) Change method mentioned but not explicitly described
3) Yes, specific elements of the change method explicitly described

CQI-5 DATA-DRIVEN: In your judgment, to what extent was the design AND/OR implementation
of a set of changes for improving care (ie, change package) driven by data collected
systematically during implementation?

(1) Very Little or Not At All   (2)   (3) Somewhat   (4)   (5) Substantially

CQI-6 LOCAL CONDITIONS: In your judgment, to what extent were local conditions at study
organisation(s) or site(s) taken into account in the design AND/OR implementation of the set
of specific changes for improving care (ie, the change package)?

Local conditions: Barriers, resources, or baseline characteristics of the organisation(s) or local site(s) that could
influence outcomes or progress (eg, costs, capabilities, leadership, QI experience; patient population; or
provider characteristics).

(1) Very Little or Not At All   (2)   (3) Somewhat   (4)   (5) Substantially

Assessment of CQI features 

Two reviewers (YL, RF) pilot tested the initial CQI 
features assessment on a subset of 45 included QIIs. Two 
reviewers (YL, SO) applied the final CQI features 
assessment to the remaining 106 included QIIs. 

Analysis 

In calculating consensus results, we adjusted for reviewer 
effect. Some reviewers consistently rate items lower on 
a scale (ie, the mean, or midpoint, around which their 
ratings vary is lower) and some reviewers rate consis- 
tently higher. 30 Reviewer effect adjustment normalises 
raters to a common mean. We computed the inter-rater
reliabilities of the QII and CQI features assessment using
κ statistics (for bivariate assessments) and intra-class
correlations (for scales).
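
A minimal sketch of these reliability calculations follows. It assumes that reviewer effect adjustment is a simple shift of each reviewer's ratings to the common mean, and that the intra-class correlation is the one-way random-effects form for two raters; the text above does not specify either detail, and all function names are illustrative.

```python
from collections import Counter
from statistics import mean


def percent_agreement(a, b):
    """Percentage of articles on which the two reviewers gave the same rating."""
    return 100.0 * sum(x == y for x, y in zip(a, b)) / len(a)


def cohen_kappa(a, b):
    """Cohen's kappa: agreement corrected for agreement expected by chance."""
    n = len(a)
    observed = sum(x == y for x, y in zip(a, b)) / n
    freq_a, freq_b = Counter(a), Counter(b)
    expected = sum(freq_a[c] * freq_b[c] for c in freq_a) / (n * n)
    if expected == 1.0:  # both reviewers used a single identical category
        return 1.0
    return (observed - expected) / (1.0 - expected)


def adjust_for_reviewer_effect(a, b):
    """Shift each reviewer's ratings so both share the common (grand) mean.

    Assumption: a simple additive shift; the study's exact method may differ.
    """
    grand = mean(list(a) + list(b))
    shift_a, shift_b = grand - mean(a), grand - mean(b)
    return [x + shift_a for x in a], [y + shift_b for y in b]


def icc_one_way(a, b):
    """One-way random-effects ICC for two raters (assumed form).

    Undefined (division by zero) if every rating is identical.
    """
    n, k = len(a), 2
    grand = mean(list(a) + list(b))
    target_means = [(x + y) / 2 for x, y in zip(a, b)]
    ms_between = k * sum((m - grand) ** 2 for m in target_means) / (n - 1)
    ms_within = sum(
        (x - m) ** 2 + (y - m) ** 2 for x, y, m in zip(a, b, target_means)
    ) / (n * (k - 1))
    return (ms_between - ms_within) / (ms_between + (k - 1) * ms_within)


# Reviewer B rates every article one point higher than reviewer A.
rater_a, rater_b = [1, 2, 3], [2, 3, 4]
adj_a, adj_b = adjust_for_reviewer_effect(rater_a, rater_b)
print(icc_one_way(rater_a, rater_b), icc_one_way(adj_a, adj_b))
```

In this toy example the two reviewers rank the articles identically, so removing the constant one-point offset raises the intra-class correlation; this is the kind of systematic difference that the adjustment is meant to remove.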

We counted a CQI feature 'present' in an article if
both reviewers rated that feature as ≥2 (on a three-point
scale) or ≥3 (on a five-point scale). We weighted items
equally and did not prespecify a cut-off for qualifying 
a study as 'CQI.' However, to explore potential cut-off 
points, we created a composite rating by averaging across 
all CQI features for each article. We applied cut-offs by 
using the average composite rating across both 
reviewers, as well as by requiring both reviewers' 
composite ratings to independently surpass the cut-off. 
For composite ratings, we analysed results both with and 
without items CQI-5 and CQI-6 to account for the use of 
a five-point scale. 
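
The 'present' rule and the two composite cut-off approaches can be sketched as follows. This is an illustration under stated assumptions: the names are hypothetical, and the composite is computed over the three-point items CQI-1 to CQI-4 only, because the exact rule for collapsing the five-point items to three points is not given here.

```python
from statistics import mean

# Item groupings follow the assessment form; identifiers are illustrative.
THREE_POINT_ITEMS = ("CQI-1", "CQI-2", "CQI-3", "CQI-4")
FIVE_POINT_ITEMS = ("CQI-5", "CQI-6")


def feature_present(item, rating_a, rating_b):
    """Present if BOTH reviewers rate >=2 (three-point) or >=3 (five-point)."""
    threshold = 3 if item in FIVE_POINT_ITEMS else 2
    return rating_a >= threshold and rating_b >= threshold


def composite(ratings, items=THREE_POINT_ITEMS):
    """One reviewer's composite rating: equally weighted mean over items."""
    return mean(ratings[item] for item in items)


def passes_average_cutoff(a, b, cutoff, items=THREE_POINT_ITEMS):
    """Average both reviewers per item, then average over items."""
    per_item = [(a[item] + b[item]) / 2 for item in items]
    return mean(per_item) >= cutoff


def passes_independent_cutoff(a, b, cutoff, items=THREE_POINT_ITEMS):
    """Each reviewer's own composite must reach the cut-off."""
    return composite(a, items) >= cutoff and composite(b, items) >= cutoff


# Hypothetical ratings for one article.
reviewer_a = {"CQI-1": 3, "CQI-2": 2, "CQI-3": 3, "CQI-4": 2}
reviewer_b = {"CQI-1": 2, "CQI-2": 2, "CQI-3": 3, "CQI-4": 1}
print(passes_average_cutoff(reviewer_a, reviewer_b, 2.25))
print(passes_independent_cutoff(reviewer_a, reviewer_b, 2.25))
```

The example article reaches a cut-off of 2.25 on the averaged composite but not when each reviewer must reach it independently, which illustrates why the independent approach is the stricter of the two.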

RESULTS 

QII screen results 

QII screening resulted in 151 included QII articles.
Inter-rater per cent agreement for application of the
explicit screening form (prior to resolution of
disagreements) was 85.7% (κ=0.71). After the 45 articles
used for pilot testing of the CQI features assessment were
set aside, the final inclusion set comprised 106 QIIs.
Table 3 shows that most reported QIIs were hospital or
outpatient based (56% and 33% respectively). Most studies
(77%) reported no comparison group and 83% reported
improvements following interventions. About half of the
articles involved an author who had a PhD or master's
degree; 10% indicated an academic professorial type
position. Articles appeared predominantly in clinical
journals (64%).

CQI features assessment 

Table 4 shows inter-rater reliability (intra-class
correlation) and per cent agreement between reviewers for
CQI features. Per cent agreement ranged from 55.7% to
75.5% for the six items, and reviewer-adjusted intra-class
correlations ranged from 0.43 to 0.62 (in the 'fair to
good' reliability range).

Table 3 Descriptive characteristics of quality
improvement interventions (QIIs)

Characteristic                                  QIIs (n = 106), n (%)

Setting
  Hospital                                      59 (56)
  Outpatient                                    35 (33)
  Long-term care                                9 (9)
  Other                                         3 (3)
  Don't know                                    0 (0)
Evaluation target
  Change package                                101 (95)
  Change method                                 2 (2)
  Both                                          1 (1)
  Other                                         0 (0)
  Don't know                                    2 (2)
Evaluation design
  No comparison group/don't know                82 (77)
  Randomly assigned comparison group            12 (11)
  Non-randomly assigned comparison group        12 (11)
Researcher involvement in authorship
  Professor                                     11 (10)
  PhD                                           26 (25)
  Master's trained                              26 (25)
  Other                                         0 (0)
  No/don't know                                 43 (41)
Results
  Reported as showing improvement               88 (83)
  Reported as equivocal                         11 (10)
  Reported as NOT showing improvement           6 (6)
  No/don't know                                 1 (1)
Journal type
  Quality improvement/health services research  38 (36)
  Clinical                                      68 (64)
    General                                     10 (9)
    Nursing                                     17 (16)
    Other specialty                             41 (39)

Among features, feedback of systematically collected 
data was the most common (64%), followed by being at 
least 'somewhat' adapted to local conditions (61%), 
feedback at meetings involving participant leaders 
(46%), using an iterative development process (40%), 
being at least 'somewhat' data driven (34%), and using 
a recognised change method (28%). Articles in quality
improvement or health services research journals
reported all CQI features more often than articles in
clinical journals; the difference was significant for two
features, feedback of systematically collected data and
being at least 'somewhat' data driven.

Table 5 shows that 14% of articles included all six CQI
features at a score of two or more (see table 2 for
scoring). Table 6 shows another approach to assessing
cut-offs for considering an article to represent CQI
methods. This approach uses a composite rating of
'CQI-ness' based on the average score across all features
for each article. Based on achieving a composite score of
two or more, 44% of QII articles showed some level of
CQI-ness, and could be so identified with a κ for
reliability of 0.49 (fair reliability). Depending on the
cut-off value used, the proportion of interventions in our
QII sample qualifying as CQI interventions ranged from 1%
to 44%.
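
The percentages quoted in this paragraph can be checked directly against the article counts reported in tables 5 and 6. The short sketch below does so; the helper name is illustrative, and the counts are copied from those tables.

```python
# Arithmetic check of the reported percentages (n = 106 QII articles).
N_ARTICLES = 106

# Table 5, features 1-6, both reviewers rating >=2: articles with 0..6 features.
FEATURES_PRESENT_COUNTS = [15, 24, 15, 14, 11, 12, 15]


def pct(count, total=N_ARTICLES, digits=0):
    """Percentage of the article sample, rounded to the given digits."""
    return round(100 * count / total, digits)


assert sum(FEATURES_PRESENT_COUNTS) == N_ARTICLES  # every article counted once

all_six = FEATURES_PRESENT_COUNTS[-1]
print(pct(all_six, digits=1))  # share of articles with all six features
print(pct(47))                 # composite rating of two or more (table 6)
print(pct(1))                  # composite rating of exactly three (table 6)
```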

DISCUSSION 

This project used expert consensus methods to develop 
and apply potential CQI definitional features to 
a comprehensive sample of QII literature. We found 
reasonable inter-rater reliability for applying consensus- 
based features to electronically identified candidate QII 
articles. This indicates that systematic sample identifica- 
tion of CQI intervention articles is feasible. We found 
considerable variation in the reporting of individual 
features. 

We aimed to assess the feasibility of creating 
a consensus-based definition of CQI for evidence review. 
We found that while experts could agree on a core set of 
important features, and these features could be reliably 
applied to literature, few articles contained a consistent 
core set. Alternatively, we tested a composite measure of 
'CQI-ness' that reflected the quantity of CQI features 
reported. We found that this approach was feasible and 
may be useful for review purposes. This approach has 
important limitations, however, in that specific features 
may be of varying relevance depending on the purpose 
of the review. 

As an illustration of the diversity of articles with CQI 
features, only one article was maximally rated by both 
reviewers on all features. Nowhere in that article does 
the phrase 'CQI' or even 'quality improvement' appear, 
which shows the disjunct between reporting of CQI 
features and use of the term 'CQI' itself. 

During review, we noted that QII articles were incon- 
sistently organised, with important methodological 
information about the intervention scattered 
throughout the sections of the articles. For iterative 
processes and data feedback in particular (CQI-1, CQI-2, 
and CQI-3), reviewers often had to extract data from 
tables (eg, monthly infection rates) rather than the main 
text. Development of a standard order for reporting CQI 
methods and results might make CQI articles easier to 
write and review. 
Table 5 Quality improvement intervention (QII) articles, stratified by the number of continuous quality improvement (CQI)
features present

                                      QII articles (n = 106)
No. of CQI features present*          Cut-off: feature ratings ≥2   Cut-off: feature ratings = 3
                                      n (%)                         n (%)

Of features 1-6
  0 of 6 features                     15 (14)                       47 (44)
  1 of 6 features                     24 (23)                       21 (20)
  2 of 6 features                     15 (14)                       16 (15)
  3 of 6 features                     14 (13)                       10 (9)
  4 of 6 features                     11 (10)                       6 (6)
  5 of 6 features                     12 (11)                       5 (5)
  6 of 6 features                     15 (14)                       1 (1)
Of features 1-4
  0 of 4 features                     28 (26)                       50 (47)
  1 of 4 features                     21 (20)                       26 (25)
  2 of 4 features                     23 (22)                       21 (20)
  3 of 4 features                     14 (13)                       7 (7)
  4 of 4 features                     20 (19)                       2 (2)

*'Present' implies both reviewer ratings were greater than or equal to the indicated cut-off. For items CQI-5 and CQI-6, ratings were collapsed to
a three-point scale (from the original five-point scale).



Two items, data-drivenness (CQI-5) and degree of
adaptation to local conditions (CQI-6), required
implicit reviewer judgement due to our inability to
develop reliable explicit criteria for assessing them.
Some articles, for example, implied data-drivenness by
alluding to quantitative audit/feedback mechanisms
employed during implementation, but did not display
any data. Multisite trials of standardised change
packages, as another example, might imply methods for
local involvement, but describe local adaptations only
vaguely.

An earlier CQI evidence review 7 also identified the
issue of variable language use and reporting. Efforts to
standardise reporting for randomised controlled
trials 13-15 and QIIs 31 have proven useful. Our results
support similar efforts for CQI interventions.

Table 6 Quality improvement interventions articles, stratified by composite rating over all continuous quality improvement
(CQI) features

                                  No. of articles
Article composite rating          Average composite rating    Independent composite         κ
cut-off‡                          above cut-off*, n (%)       ratings above cut-off†, n (%)

CQI-1 through CQI-6
  ≥2.00                           47 (44)                     37 (35)                       0.49
  ≥2.25                           37 (35)                     25 (24)                       0.59
  ≥2.50                           23 (22)                     19 (18)                       0.47
  ≥2.75                           11 (10)                     7 (7)                         0.38
  =3.00                           1 (1)                       1 (1)                         0.11
CQI-1 through CQI-4 only
  ≥2.00                           47 (44)                     40 (38)                       0.44
  ≥2.25                           41 (39)                     30 (28)                       0.58
  ≥2.50                           25 (24)                     17 (16)                       0.45
  ≥2.75                           10 (9)                      10 (9)                        0.41
  =3.00                           2 (2)                       2 (2)                         0.21

*Calculated for each article by taking the average of both reviewers' ratings for each item, and then taking the average over all items. For an
article to count, the average composite rating had to surpass the indicated cut-off value.

†Calculated for each article, separately for each reviewer, by taking the average rating over all items. For an article to count, both reviewers'
independent composite ratings had to surpass the indicated cut-off value.

‡Composite ratings could range from 1.00 to 3.00. For items CQI-5 and CQI-6, ratings were collapsed to a three-point scale (from the original
five-point scale).

This study has limitations. The lack of relevant medical 
subject heading terms for either QII or CQI, in addition 
to inherent variation in CQI language use, may have 
reduced search sensitivity. To address this limitation, we 
used an inclusive electronic search strategy (Hempel 
et al, submitted) and additional expert referral of arti- 
cles. This in turn resulted in a large candidate article 
set that required substantial screening. The number of 
electronically generated articles, however, is within the 
range of major evidence reviews. 32-34 We further expect 
that studies may most likely apply our methods to 
smaller sets addressing CQI subtopics, such as CQI 
for diabetes. The expert panel portion of this study is 
limited by involvement of a small though diverse 
group of key stakeholders. The purpose of the study, 
however, was to clarify and describe variations in 
reporting of key CQI features rather than to propose 
a final definition. 

Currently, given the low agreement on the meaning of
the term 'CQI', readers can have very little confidence
that reviews of CQI interventions will include coherent
samples of the literature. Without explicit identification
of specific CQI features, reviews will yield
uninterpretable results. Continued work assessing CQI
features in relevant literature will result in more
efficient, effective learning about this important quality
improvement approach. Meanwhile, the more explicit
CQI authors can be in describing the key features of
their CQI interventions, 31 the more interpretable and
useful the results of their work will be.